AITopics | loss barrier

d9dc5573f7368201d6409e07e882aa77-Paper-Conference.pdf

Neural Information Processing Systems, Feb-17-2026, 10:55:24 GMT

artificial intelligence, international conference, machine learning, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
(7 more...)

On permutation symmetries in Bayesian neural network posteriors: a variational perspective

Neural Information Processing Systems, Dec-26-2025, 22:20:49 GMT

The elusive nature of gradient-based optimization in neural networks is tied to their loss landscape geometry, which is poorly understood. However, recent work has brought solid evidence that there is essentially no loss barrier between the local solutions of gradient descent, once accounting for weight-permutations that leave the network's computation unchanged. This raises questions for approximate inference in Bayesian neural networks (BNNs), where we are interested in marginalizing over multiple points in the loss landscape. In this work, we first extend the formalism of marginalized loss barrier and solution interpolation to BNNs, before proposing a matching algorithm to search for linearly connected solutions. This is achieved by aligning the distributions of two independent approximate Bayesian solutions with respect to permutation matrices.

bayesian neural network posterior, name change, permutation symmetry, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.54)

d9dc5573f7368201d6409e07e882aa77-Paper-Conference.pdf

Neural Information Processing Systems, Oct-9-2025, 09:01:04 GMT

artificial intelligence, international conference, machine learning, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
(10 more...)

A Experimental Details

Neural Information Processing Systems, Aug-16-2025, 02:49:55 GMT

Figure 8 shows the results.

artificial intelligence, machine learning, subset, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

77dd8e90fe833eba5fae86cf017d7a56-Paper-Conference.pdf

Neural Information Processing Systems, Aug-16-2025, 02:49:48 GMT

artificial intelligence, initialization, machine learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Generalized Linear Mode Connectivity for Transformers

Theus, Alexander, Cabodi, Alessandro, Anagnostidis, Sotiris, Orvieto, Antonio, Singh, Sidak Pal, Boeva, Valentina

arXiv.org Machine Learning, Jul-1-2025

Understanding the geometry of neural network loss landscapes is a central question in deep learning, with implications for generalization and optimization. A striking phenomenon is linear mode connectivity (LMC), where independently trained models can be connected by low- or zero-loss paths, despite appearing to lie in separate loss basins. However, this is often obscured by symmetries in parameter space -- such as neuron permutations -- which make functionally equivalent models appear dissimilar. Prior work has predominantly focused on neuron re-ordering through permutations, but such approaches are limited in scope and fail to capture the richer symmetries exhibited by modern architectures such as Transformers. In this work, we introduce a unified framework that captures four symmetry classes: permutations, semi-permutations, orthogonal transformations, and general invertible maps -- broadening the set of valid reparameterizations and subsuming many previous approaches as special cases. Crucially, this generalization enables, for the first time, the discovery of low- and zero-barrier linear interpolation paths between independently trained Vision Transformers and GPT-2 models. These results reveal deeper structure in the loss landscape and underscore the importance of symmetry-aware analysis for understanding model space geometry.
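The permutation symmetry mentioned in this abstract can be illustrated on a tiny network. The sketch below is not taken from the paper: it assumes a hypothetical two-layer ReLU MLP with made-up sizes and random weights, and only shows that reordering hidden units (rows of the first layer together with the matching columns of the second) leaves the output unchanged. The richer symmetry classes named in the abstract (semi-permutations, orthogonal and general invertible maps) are not covered here.

```python
import random

def relu(v):
    return v if v > 0.0 else 0.0

def mlp(x, W1, b1, W2):
    """Two-layer ReLU network: y = W2 @ relu(W1 @ x + b1)."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) for row in W2]

def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: rows of W1 and b1, matching columns of W2."""
    W1p = [W1[j] for j in perm]
    b1p = [b1[j] for j in perm]
    W2p = [[row[j] for j in perm] for row in W2]
    return W1p, b1p, W2p

# Hypothetical sizes and random weights, chosen only for illustration.
rng = random.Random(0)
d_in, d_hidden, d_out = 3, 5, 2
W1 = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)]
b1 = [rng.gauss(0, 1) for _ in range(d_hidden)]
W2 = [[rng.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_out)]
x = [rng.gauss(0, 1) for _ in range(d_in)]

perm = list(range(d_hidden))
rng.shuffle(perm)
W1p, b1p, W2p = permute_hidden(W1, b1, W2, perm)

y = mlp(x, W1, b1, W2)
y_perm = mlp(x, W1p, b1p, W2p)

# The two parameter vectors differ, but the outputs agree up to
# floating-point rounding (the summation order changes).
max_diff = max(abs(a - b) for a, b in zip(y, y_perm))
```

Because the permuted weights are a different point in parameter space with identical behavior, two independently trained models can look far apart in raw weights while computing similar functions.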

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2506.22712

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Asia > Middle East > Jordan (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The Butterfly Effect: Neural Network Training Trajectories Are Highly Sensitive to Initial Conditions

Kwok, Devin, Altıntaş, Gül Sena, Raffel, Colin, Rolnick, David

arXiv.org Artificial Intelligence, Jun-17-2025

Neural network training is inherently sensitive to initialization and the randomness induced by stochastic gradient descent. However, it is unclear to what extent such effects lead to meaningfully different networks, either in terms of the models' weights or the underlying functions that were learned. In this work, we show that during the initial "chaotic" phase of training, even extremely small perturbations reliably cause otherwise identical training trajectories to diverge, an effect that diminishes rapidly over training time. We quantify this divergence through (i) $L^2$ distance between parameters, (ii) the loss barrier when interpolating between networks, (iii) $L^2$ and barrier between parameters after permutation alignment, and (iv) representational similarity between intermediate activations, revealing how perturbations across different hyperparameter or fine-tuning settings drive training trajectories toward distinct loss minima. Our findings provide insights into neural network training stability, with practical implications for fine-tuning, model merging, and diversity of model ensembles.
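Two of the quantities listed above, the $L^2$ distance between parameters and the loss barrier under linear interpolation, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's code: `toy_loss` is a hypothetical two-parameter loss with minima at (-1, 0) and (1, 0), and the barrier uses one common definition (the largest gap between the loss on the interpolation path and the straight line joining the endpoint losses); the papers listed here may use variants.

```python
import math

def toy_loss(theta):
    """Hypothetical loss with two minima, at (-1, 0) and (1, 0)."""
    x, y = theta
    return (x * x - 1.0) ** 2 + y * y

def l2_distance(theta_a, theta_b):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(theta_a, theta_b)))

def interpolate(theta_a, theta_b, alpha):
    """Point on the segment: (1 - alpha) * theta_a + alpha * theta_b."""
    return [(1.0 - alpha) * a + alpha * b for a, b in zip(theta_a, theta_b)]

def loss_barrier(loss_fn, theta_a, theta_b, num_points=11):
    """Max over alpha of path loss minus the interpolated endpoint losses."""
    loss_a, loss_b = loss_fn(theta_a), loss_fn(theta_b)
    barrier = 0.0
    for i in range(num_points):
        alpha = i / (num_points - 1)
        path_loss = loss_fn(interpolate(theta_a, theta_b, alpha))
        chord = (1.0 - alpha) * loss_a + alpha * loss_b
        barrier = max(barrier, path_loss - chord)
    return barrier

# Two solutions in different basins: the path crosses the ridge at x = 0.
distance_across = l2_distance([-1.0, 0.0], [1.0, 0.0])
barrier_across = loss_barrier(toy_loss, [-1.0, 0.0], [1.0, 0.0])

# Two points in the same basin: the loss never rises above the chord.
barrier_same = loss_barrier(toy_loss, [1.0, 0.0], [1.0, 0.5])
```

In this toy case the cross-basin pair has a barrier of 1.0, reached at the midpoint, while the same-basin pair has a barrier of 0. With real networks, `loss_fn` would evaluate the interpolated weights on held-out data, and the grid resolution `num_points` trades cost against accuracy.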

artificial intelligence, machine learning, perturbation, (17 more...)

arXiv.org Artificial Intelligence

2506.13234

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Leveraging Per-Instance Privacy for Machine Unlearning

Sepahvand, Nazanin Mohammadi, Thudi, Anvith, Isik, Berivan, Bhattacharyya, Ashmita, Papernot, Nicolas, Triantafillou, Eleni, Roy, Daniel M., Dziugaite, Gintare Karolina

arXiv.org Artificial Intelligence, May-27-2025

We present a principled, per-instance approach to quantifying the difficulty of unlearning via fine-tuning. We begin by sharpening an analysis of noisy gradient descent for unlearning (Chien et al., 2024), obtaining a better utility-unlearning tradeoff by replacing worst-case privacy loss bounds with per-instance privacy losses (Thudi et al., 2024), each of which bounds the (Rényi) divergence to retraining without an individual data point. To demonstrate the practical applicability of our theory, we present empirical results showing that our theoretical predictions are borne out both for Stochastic Gradient Langevin Dynamics (SGLD) and for standard fine-tuning without explicit noise. We further demonstrate that per-instance privacy losses correlate well with several existing data difficulty metrics, while also identifying harder groups of data points, and introduce novel evaluation methods based on loss barriers. Altogether, our findings provide a foundation for more efficient and adaptive unlearning strategies tailored to the unique properties of individual data points.

artificial intelligence, machine learning, privacy loss, (13 more...)

arXiv.org Artificial Intelligence

2505.18786

Country:

North America > Canada > Quebec (0.28)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Unveiling Mode Connectivity in Graph Neural Networks

Li, Bingheng, Chen, Zhikai, Han, Haoyu, Zeng, Shenglai, Liu, Jingzhe, Tang, Jiliang

arXiv.org Artificial Intelligence, Feb-18-2025

A fundamental challenge in understanding graph neural networks (GNNs) lies in characterizing their optimization dynamics and loss landscape geometry, critical for improving interpretability and robustness. While mode connectivity, a lens for analyzing geometric properties of loss landscapes, has proven insightful for other deep learning architectures, its implications for GNNs remain unexplored. This work presents the first investigation of mode connectivity in GNNs. We uncover that GNNs exhibit distinct non-linear mode connectivity, diverging from patterns observed in fully-connected networks or CNNs. Crucially, we demonstrate that graph structure, rather than model architecture, dominates this behavior, with graph properties like homophily correlating with mode connectivity patterns. We further establish a link between mode connectivity and generalization, proposing a generalization bound based on loss barriers and revealing its utility as a diagnostic tool. Our findings further bridge theoretical insights with practical implications: they rationalize domain alignment strategies in graph learning and provide a foundation for refining GNN training paradigms.

artificial intelligence, machine learning, mode connectivity, (14 more...)

arXiv.org Artificial Intelligence

2502.12608

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.66)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

On permutation symmetries in Bayesian neural network posteriors: a variational perspective

Neural Information Processing Systems, Jan-20-2025, 00:03:23 GMT

The elusive nature of gradient-based optimization in neural networks is tied to their loss landscape geometry, which is poorly understood. However, recent work has brought solid evidence that there is essentially no loss barrier between the local solutions of gradient descent, once accounting for weight-permutations that leave the network's computation unchanged. This raises questions for approximate inference in Bayesian neural networks (BNNs), where we are interested in marginalizing over multiple points in the loss landscape. In this work, we first extend the formalism of marginalized loss barrier and solution interpolation to BNNs, before proposing a matching algorithm to search for linearly connected solutions. This is achieved by aligning the distributions of two independent approximate Bayesian solutions with respect to permutation matrices. We then experiment on a variety of architectures and datasets, finding nearly zero marginalized loss barriers for linearly connected solutions.
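The idea of searching over permutations to align two solutions can be sketched for plain point-estimate weights. This is a simplified stand-in, not the paper's algorithm: the abstract describes aligning distributions of approximate Bayesian solutions, whereas the hypothetical example below only matches the first-layer weight rows of two deterministic models by brute force, choosing the ordering with the smallest squared $L^2$ distance. All names and sizes are assumptions made for illustration.

```python
import itertools
import random

def sq_dist(rows_a, rows_b):
    """Squared L2 distance between two weight matrices given as row lists."""
    return sum((a - b) ** 2
               for ra, rb in zip(rows_a, rows_b)
               for a, b in zip(ra, rb))

def best_permutation(W1_a, W1_b):
    """Ordering of B's hidden units that brings B closest to A.

    Brute force over all n! orderings, so only usable for tiny layers.
    """
    n = len(W1_a)
    return min(
        itertools.permutations(range(n)),
        key=lambda p: sq_dist(W1_a, [W1_b[j] for j in p]),
    )

# Hypothetical setup: model B has the same hidden units as model A,
# stored in a different order.
rng = random.Random(1)
n_hidden, d_in = 4, 3
W1_a = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(n_hidden)]
shuffle_order = [2, 0, 3, 1]
W1_b = [W1_a[j] for j in shuffle_order]

found = best_permutation(W1_a, W1_b)
W1_b_aligned = [W1_b[j] for j in found]

dist_before = sq_dist(W1_a, W1_b)
dist_after = sq_dist(W1_a, W1_b_aligned)
```

After alignment the two weight matrices coincide, so linear interpolation between them is trivial. For realistic layer widths the factorial search is infeasible; a linear assignment solver such as `scipy.optimize.linear_sum_assignment` is a common substitute, and later layers must be permuted consistently with earlier ones.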

bayesian neural network posterior, permutation symmetry, variational perspective, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
